[Single File] Add single file support for Flux Transformer #9083

Merged: DN6 merged 5 commits into main from flux-single-file on Aug 7, 2024

@DN6 DN6 commented Aug 5, 2024

What does this PR do?

Add single file support for the Flux Transformer model.
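As a rough illustration of what this enables, the sketch below loads the transformer from a single checkpoint file. It assumes `diffusers` and `torch` are installed; the helper name `load_flux_transformer` and the choice of `bfloat16` are illustrative assumptions, not something this PR prescribes.

```python
def load_flux_transformer(checkpoint_path: str):
    """Load a Flux transformer from one checkpoint file (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    import torch
    from diffusers import FluxTransformer2DModel

    # bfloat16 is an illustrative dtype choice (assumption).
    return FluxTransformer2DModel.from_single_file(
        checkpoint_path, torch_dtype=torch.bfloat16
    )
```

A call would look like `load_flux_transformer("path/to/flux-checkpoint.safetensors")`, where the path is a placeholder for a local file or Hub URL.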

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@DN6 DN6 changed the title [Single File] Add single file support for Flux [Single File] Add single file support for Flux Transformer Aug 5, 2024
Comment on lines +1877 to +1878
mlp_ratio = 4.0
inner_dim = 3072
Collaborator Author

Should be okay to hardcode these, no? They are the same across both models. We can grab inner_dim from the checkpoint, but I'm not sure about mlp_ratio.

Collaborator Author

cc: @yiyixuxu

Member

I think it's okay to hardcode them. I would just define them as proper constants at the top of the file.

Collaborator Author

But if they're only applied within the scope of this function, they don't need to exist as constants that can be accessed globally?
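To make the trade-off in this thread concrete, here is a minimal sketch of reading `inner_dim` from a checkpoint while keeping `mlp_ratio` as a fixed default. The key name `img_in.weight`, its shape, and both helper names are assumptions for illustration; the actual conversion code in this PR may look different.

```python
def infer_inner_dim(shapes, key="img_in.weight"):
    """Return the output width of a linear layer from its weight shape.

    `shapes` maps checkpoint keys to tensor shapes. The default key is an
    assumption about the checkpoint layout, not taken from the PR.
    """
    # A linear layer's weight is stored as (out_features, in_features).
    return shapes[key][0]


def mlp_hidden_dim(inner_dim, mlp_ratio=4.0):
    """Width of the MLP hidden layer for a given ratio (4.0 as in the review)."""
    return int(inner_dim * mlp_ratio)


# Hypothetical shape table standing in for a loaded state dict.
example_shapes = {"img_in.weight": (3072, 64)}
inner_dim = infer_inner_dim(example_shapes)
hidden = mlp_hidden_dim(inner_dim)
```

Keeping `mlp_ratio` as a function default is one way to avoid a global constant while still naming the value in a single place.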

mlp_ratio = 4.0
inner_dim = 3072

# in SD3 original implementation of AdaLayerNormContinuous, it split linear projection output into shift, scale;
Member

Can we leave this outside of the function, since SD3 also uses it?
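For context, the code comment quoted above concerns the ordering of the two halves of a projection weight. A minimal sketch of such a swap, using plain Python lists in place of tensors, might look like this. The function name and the assumption that the target layout is (scale, shift) are illustrative and are not quoted from the diff.

```python
def swap_scale_shift(rows):
    """Reorder a projection stored as [shift | scale] into [scale | shift].

    `rows` stands in for the weight's rows along dim 0 (assumption: the two
    halves are equal in size and the target order is scale first).
    """
    if len(rows) % 2 != 0:
        raise ValueError("expected an even number of rows")
    half = len(rows) // 2
    shift, scale = rows[:half], rows[half:]
    return scale + shift
```

Because the helper depends only on the weight layout and not on any one model, it can plausibly be shared at module level, which is the point raised in the comment.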

Member

@sayakpaul sayakpaul left a comment

LGTM!

Two things:

  • Let's add it to the list of models that support single file loading.
  • Let's see if FP8 support works? If so, let's attach a code snippet?

@sayakpaul
Member

@DN6 let's merge this after the TODOs?

@DN6 DN6 merged commit e1b603d into main Aug 7, 2024
sayakpaul added a commit that referenced this pull request Dec 23, 2024
* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>